Learning Partially Observable Action Schemas
Abstract
We present an algorithm that derives actions’ effects and preconditions in partially observable, relational domains. Our algorithm has two unique features: an expressive relational language, and an exact tractable computation. An action-schema language that we present permits learning of preconditions and effects that include implicit objects and unstated relationships between objects. For example, we can learn that replacing a blown fuse turns on all the lights whose switch is set to on. The algorithm maintains and outputs a relational-logical representation of all possible action-schema models after a sequence of executed actions and partial observations. Importantly, our algorithm takes polynomial time in the number of time steps and predicates. Time dependence on other domain parameters varies with the action-schema language. Our experiments show that the relational structure speeds up both learning and generalization, and outperforms propositional learning methods. It also allows establishing a priori unknown connections between objects (e.g., light bulbs and their switches), and permits learning conditional effects in realistic and complex situations. Our algorithm takes advantage of a DAG structure that can be updated efficiently and preserves compactness of representation.
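The fuse example in the abstract can be made concrete with a small sketch. The code below is not the paper's algorithm or its schema language; it only illustrates, under assumed names, what a conditional effect over implicit objects looks like. The predicates `fuse_blown`, `controls`, `switch_on`, and `lit`, and the function `replace_fuse`, are all hypothetical.

```python
from typing import FrozenSet, Tuple

Atom = Tuple[str, ...]      # a ground atom, e.g. ("switch_on", "s1")
State = FrozenSet[Atom]     # a fully specified world state


def replace_fuse(state: State) -> State:
    """Hypothetical schema: replacing a blown fuse lights every bulb
    whose controlling switch is on."""
    # Assumed precondition: the fuse is blown; otherwise nothing changes.
    if ("fuse_blown",) not in state:
        return state
    new_state = set(state)
    new_state.discard(("fuse_blown",))
    # Conditional effect over objects that are not arguments of the action:
    # for all s, l: controls(s, l) and switch_on(s) implies lit(l).
    for atom in state:
        if atom[0] == "controls":
            _, switch, light = atom
            if ("switch_on", switch) in state:
                new_state.add(("lit", light))
    return frozenset(new_state)


initial: State = frozenset({
    ("fuse_blown",),
    ("controls", "s1", "l1"),
    ("controls", "s2", "l2"),
    ("switch_on", "s1"),
})
result = replace_fuse(initial)
```

Here only `l1` becomes lit, because `s2` is off. A learner in the abstract's setting would have to recover both the `controls` relationship and the condition on `switch_on` from partial observations, rather than being handed them as this sketch is.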
Similar resources
Learning Partially Observable Action Models
In this paper we present tractable algorithms for learning a logical model of actions’ effects and preconditions in deterministic partially observable domains. These algorithms update a representation of the set of possible action models after every observation and action execution. We show that when actions are known to have no conditional effects, then the set of possible action models can be...
Learning Partially Observable Action Models: Efficient Algorithms
We present tractable, exact algorithms for learning actions’ effects and preconditions in partially observable domains. Our algorithms maintain a propositional logical representation of the set of possible action models after each observation and action execution. The algorithms perform exact learning of preconditions and effects in any deterministic action domain. This includes STRIPS actions ...
Apprenticeship Learning for Model Parameters of Partially Observable Environments
We consider apprenticeship learning — i.e., having an agent learn a task by observing an expert demonstrating the task — in a partially observable environment when the model of the environment is uncertain. This setting is useful in applications where the explicit modeling of the environment is difficult, such as a dialogue system. We show that we can extract information about the environment m...
Robot Navigation in Partially Observable Domains using Hierarchical Memory-Based Reinforcement Learning
In this paper, we attempt to find a solution to the problem of robot navigation in a domain with partial observability. The domain is a grid-world with intersecting corridors, where the agent learns an optimal policy for navigation by making use of a hierarchical memory-based learning algorithm. We define a hierarchy of levels over which the agent abstracts the learning process, as well as it...
Demonstration of a POMDP Voice Dialer
This is a demonstration of a voice dialer, implemented as a partially observable Markov decision process (POMDP). A real-time graphical display shows the POMDP’s probability distribution over different possible dialog states, and shows how system output is generated and selected. The system demonstrated here includes several recent advances, including an action selection mechanism which unifies ...